42 research outputs found

    Developments in the theory of randomized shortest paths with a comparison of graph node distances

    There have lately been several suggestions for parametrized distances on a graph that generalize the shortest path distance and the commute time or resistance distance. The need for developing such distances has arisen from the observation that the above-mentioned common distances in many situations fail to take into account the global structure of the graph. In this article, we develop the theory of one family of graph node distances, known as the randomized shortest path dissimilarity, which has its foundation in statistical physics. We show that the randomized shortest path dissimilarity can be easily computed in closed form for all pairs of nodes of a graph. Moreover, we introduce a new distance measure that we call the free energy distance. The free energy distance can be seen as an upgrade of the randomized shortest path dissimilarity: it defines a metric and also satisfies the graph-geodetic property. The derivation and computation of the free energy distance are also straightforward. We then compare a set of generalized distances that interpolate between the shortest path distance and the commute time, or resistance distance. This comparison focuses on the applicability of the distances in graph node clustering and classification. The comparison, in general, shows that the parametrized distances perform well in the tasks. In particular, we see that the results obtained with the free energy distance are among the best in all the experiments. Comment: 30 pages, 4 figures, 3 tables
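
    As a rough illustration of the closed-form computation mentioned in the abstract above, the sketch below computes a free energy distance matrix in plain Python. The abstract gives no formulas, so the construction used here is an assumption based on the usual RSP setup: W = P_ref ∘ exp(−βC) with P_ref the natural random walk, Z = (I − W)^(-1), directed free energies φ(i, j) = −(1/β) log(z_ij / z_jj), and the distance taken as their symmetrized average. The names `invert`, `free_energy_distance` and `path_graph` are hypothetical, and the graph is assumed to be connected with strictly positive edge costs.

```python
import math


def invert(matrix):
    """Invert a square matrix (list of lists) by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def free_energy_distance(adjacency, costs, beta):
    """Assumed free energy distance for a connected graph with positive costs."""
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    # W = P_ref * exp(-beta * C) elementwise, with P_ref = D^{-1} A.
    w = [[adjacency[i][j] / degrees[i] * math.exp(-beta * costs[i][j])
          if adjacency[i][j] > 0 else 0.0
          for j in range(n)] for i in range(n)]
    # Fundamental matrix Z = (I - W)^{-1}.
    z = invert([[(1.0 if i == j else 0.0) - w[i][j] for j in range(n)]
                for i in range(n)])
    # Directed free energy phi(i, j) = -(1/beta) * log(z_ij / z_jj).
    phi = [[-math.log(z[i][j] / z[j][j]) / beta for j in range(n)]
           for i in range(n)]
    # Symmetrize to obtain the distance matrix.
    return [[0.5 * (phi[i][j] + phi[j][i]) for j in range(n)]
            for i in range(n)]


# Hypothetical example: the path graph 0 - 1 - 2 with unit edge costs.
path_graph = [[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]]
distances = free_energy_distance(path_graph, path_graph, beta=1.0)
```

    On this path graph, node 1 separates the two end nodes, so a graph-geodetic distance should give d(0, 2) = d(0, 1) + d(1, 2). With a large `beta` the values should move towards the shortest path lengths.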

    Two betweenness centrality measures based on Randomized Shortest Paths

    This paper introduces two new closely related betweenness centrality measures based on the Randomized Shortest Paths (RSP) framework, which fill a gap between traditional network centrality measures based on shortest paths and more recent methods considering random walks or current flows. The framework defines Boltzmann probability distributions over paths of the network which focus on the shortest paths, but also take into account longer paths depending on an inverse temperature parameter. RSPs have previously proven to be useful in defining distance measures on networks. In this work we study their utility in quantifying the importance of the nodes of a network. The proposed RSP betweenness centralities combine, in an optimal way, the ideas of using the shortest and purely random paths for analysing the roles of network nodes, avoiding issues involving these two paradigms. We present the derivations of these measures and show how they can be computed efficiently. In addition, we show with real-world examples the potential of the RSP betweenness centralities in identifying interesting nodes of a network that more traditional methods might fail to notice. Comment: Minor updates; published in Scientific Reports

    Chord-melody arrangements for jazz guitar (SOINTUMELODIASOVITUKSIA JAZZKITARALLE)

    The thesis discusses the nature of jazz and its background, which forms the basis for the main part of the work: arranging chord melodies. The work also includes views on the pedagogy of jazz teaching. The chord-melody part of the thesis deals with arranging chord melodies for guitar. The arrangements are based on well-known jazz standards, six of which were selected for examination in the thesis. The purpose of the arrangements is to examine the possibilities of the jazz guitar as an instrument that can play chords and chord melodies in addition to single notes.

    Randomized Optimal Transport on a Graph: framework and new distance measures

    The recently developed bag-of-paths (BoP) framework consists in setting a Gibbs-Boltzmann distribution on all feasible paths of a graph. This probability distribution favors short paths over long ones, with a free parameter (the temperature $T$) controlling the entropic level of the distribution. This formalism enables the computation of new distances or dissimilarities, interpolating between the shortest-path and the resistance distance, which have been shown to perform well in clustering and classification tasks. In this work, the bag-of-paths formalism is extended by adding two independent equality constraints fixing starting and ending nodes distributions of paths (margins). When the temperature is low, this formalism is shown to be equivalent to a relaxation of the optimal transport problem on a network where paths carry a flow between two discrete distributions on nodes. The randomization is achieved by considering free energy minimization instead of traditional cost minimization. Algorithms computing the optimal free energy solution are developed for two types of paths: hitting (or absorbing) paths and non-hitting, regular, paths, and require the inversion of an $n \times n$ matrix with $n$ being the number of nodes. Interestingly, for regular paths on an undirected graph, the resulting optimal policy interpolates between the deterministic optimal transport policy ($T \rightarrow 0^{+}$) and the solution to the corresponding electrical circuit ($T \rightarrow \infty$). Two distance measures between nodes and a dissimilarity between groups of nodes, both integrating weights on nodes, are derived from this framework. Comment: Preprint paper to appear in Network Science journal, Cambridge University Press

    Maximum likelihood estimation for randomized shortest paths with trajectory data

    Randomized shortest paths (RSPs) are a tool developed in recent years for different graph and network analysis applications, such as modelling movement or flow in networks. In essence, the RSP framework considers the temperature-dependent Gibbs–Boltzmann distribution over paths in the network. At low temperatures, the distribution focuses solely on the shortest or least-cost paths, while with increasing temperature, the distribution spreads over random walks on the network. Many relevant quantities can be computed conveniently from this distribution, and these often generalize traditional network measures in a sensible way. However, when modelling real phenomena with RSPs, one needs a principled way of estimating the parameters from data. In this work, we develop methods for computing the maximum likelihood estimate of the model parameters, with focus on the temperature parameter, when modelling phenomena based on movement, flow or spreading processes. We test the validity of the derived methods with trajectories generated on artificial networks as well as with real data on the movement of wild reindeer in a geographic landscape, used for estimating the degree of randomness in the movement of the animals. These examples demonstrate the attractiveness of the RSP framework as a generic model to be used in diverse applications. Keywords: randomized shortest paths; random walk; shortest path; parameter estimation; maximum likelihood; animal movement modelling

    Well-being, sickness absences and occupational accidents among physicians doing on-call work (Päivystystyötä tekevien lääkäreiden hyvinvointi, sairauspoissaolot ja työtapaturmat)

    The project aimed to examine annual trends in hospital physicians' working hours and to study the associations of working hours and on-call duty with health and well-being in 12 hospital districts. Average annual working hours and the share of on-call shifts shorter than 13 hours among all on-call shifts increased during 2014‒2018. Work spells exceeding 12 and 24 hours were associated with short sickness absences, and a high number of on-call shifts increased the risk of both sickness absence and accidents. Weekly working hours above 48 hours, night work, frequently recurring on-call shifts and short intervals between shifts were associated with insufficient sleep. Long weekly working hours increased difficulties in reconciling work with the rest of life. Based on the project's results, it is recommended to minimise the number of very long working weeks and long on-call shifts and to keep the amount of night work and short shift intervals moderate.

    Drinking Water Quality and Occurrence of Giardia in Finnish Small Groundwater Supplies

    The microbiological and chemical drinking water quality of 20 vulnerable Finnish small groundwater supplies was studied in relation to environmental risk factors associated with potential sources of contamination. The microbiological parameters analyzed included the following enteric pathogens: Giardia and Cryptosporidium, Campylobacter species, noroviruses, as well as indicator microbes (Escherichia coli, intestinal enterococci, coliform bacteria, Clostridium perfringens, Aeromonas spp. and heterotrophic bacteria). Chemical analyses included the determination of pH, conductivity, TOC, color, turbidity, and phosphorus, nitrate and nitrite nitrogen, iron, and manganese concentrations. Giardia intestinalis was detected in four of the water supplies, all of which had wastewater treatment activities in the neighborhood. Mesophilic Aeromonas salmonicida, coliform bacteria and E. coli were also detected. None of the samples were positive for both coliforms and Giardia. Low pH and high iron and manganese concentrations in some samples compromised the water quality. Giardia intestinalis was isolated for the first time in Finland in groundwater wells of public water works. In Europe, small water supplies are of great importance since they serve a significant sector of the population. In our study, the presence of fecal indicator bacteria, Aeromonas and Giardia revealed surface water access to the wells and health risks associated with small water supplies. Peer reviewed

    Accelerating advances in landscape connectivity modelling with the ConScape library

    Increasingly precise spatial data (e.g. high-resolution imagery from remote sensing) allow for improved representations of the landscape network for assessing the combined effects of habitat loss and connectivity declines on biodiversity. However, evaluating large landscape networks presents a major computational challenge both in terms of working memory and computation time. We present the ConScape (i.e. “connected landscapes”) software library implemented in the high-performance open-source Julia language to compute metrics for connected habitat and movement flow on high-resolution landscapes. The combination of Julia's ‘just-in-time’ compiler, efficient algorithms and ‘landmarks’ to reduce the computational load allows ConScape to compute landscape ecological metrics—originally developed in metapopulation ecology (such as ‘metapopulation capacity’ and ‘probability of connectivity’)—for large landscapes. An additional major innovation in ConScape is the adoption of the randomized shortest paths framework to represent connectivity along the continuum from optimal to random movements, instead of only those extremes. We demonstrate ConScape's potential for using large datasets in sustainable land planning by modelling landscape connectivity based on remote-sensing data paired with GPS tracking of wild reindeer in Norway. To guide users, we discuss other applications, and provide a series of worked examples to showcase all ConScape's functionalities in Supplementary Material. Built by a team of ecologists, network scientists and software developers, ConScape is able to efficiently compute landscape metrics for high-resolution landscape representations to leverage the availability of large data for sustainable land use and biodiversity conservation. 
    As a Julia implementation, ConScape combines computational efficiency with a transparent code base, which facilitates continued innovation through contributions from the rapidly growing community of landscape and connectivity modellers using Julia. Keywords: circuitscape, conefor, ecological networks, least-cost path, metapopulation, random walk, randomized shortest paths

    Job demands as processes: machine learning methods and theories as aids to identification (Työn vaatimukset prosesseina : Koneoppimismenetelmät ja teoriat tunnistamisen apuna)

    The report describes a project funded by the Finnish Work Environment Fund (Työsuojelurahasto), "Increasing the reliability of the use of organisational data with the help of theory and artificial intelligence" (”Organisaatiodatan hyödyntämisen luotettavuuden lisääminen teorian ja tekoälyn avulla”). The project is theoretical-methodological, and its aim was to produce two international scientific articles. The research data consist of Moodle data from teachers at two universities of applied sciences. Moodle is a virtual learning environment that teachers use for planning, delivering and assessing teaching. The first article shows how trace data generated through the use of systems can be anchored to the framework of process theory (Langley et al., 2013). This makes it possible to detect different temporal progressions of activity, that is, processes, and to draw conclusions about the straining periods in work processes. Karasek's job demand-control model (1979) is used to support the conclusions. The second article takes the analysis to a more detailed level and uses clustering as a computational method to identify six different processes of doing teaching work in spring 2019 and 2020.